vocabulary test
Towards an automatic method for generating topical vocabulary test forms for specific reading passages
Flor, Michael, Wang, Zuowei, Deane, Paul, O'Reilly, Tenaha
Background knowledge is typically needed for successful comprehension of topical and domain-specific reading passages, such as those in the STEM domain. However, there are few automated measures of student knowledge that can be readily deployed and scored in time to predict whether a given student will likely be able to understand a specific content-area text. In this paper, we present our effort in developing K-tool, an automated system for generating topical vocabulary tests that measure students' background knowledge related to a specific text. The system automatically detects the topic of a given text and produces topical vocabulary items based on their relationship with the topic. This information is used to automatically generate background knowledge forms that contain words that are highly related to the topic and words that share similar features but are not strongly associated with the topic. Prior research indicates that performance on such tasks can help determine whether a student is likely to understand a particular text based on their knowledge state. The described system is intended for use with middle and high school students who are native speakers of English. It is designed to handle single reading passages and is not dependent on any corpus or text collection. In this paper, we describe the system architecture and present an initial evaluation of the system outputs.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Hong Kong (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
ISSR: Iterative Selection with Self-Review for Vocabulary Test Distractor Generation
Vocabulary acquisition is essential to second language learning, as it underpins all core language skills. Accurate vocabulary assessment is particularly important in standardized exams, where test items evaluate learners' comprehension and contextual use of words. Previous research has explored methods for generating distractors to aid in the design of English vocabulary tests. However, current approaches often rely on lexical databases or predefined rules, and frequently produce distractors that risk invalidating the question by introducing multiple correct options. In this study, we focus on English vocabulary questions from Taiwan's university entrance exams. We analyze student response distributions to gain insights into the characteristics of these test items and provide a reference for future research. Additionally, we identify key limitations in how large language models (LLMs) support teachers in generating distractors for vocabulary test design. To address these challenges, we propose the iterative selection with self-review (ISSR) framework, which makes use of a novel LLM-based self-review mechanism to ensure that the distractors remain valid while offering diverse options. Experimental results show that ISSR achieves promising performance in generating plausible distractors, and the self-review mechanism effectively filters out distractors that could invalidate the question.
- Asia > Taiwan (0.24)
- Europe > Ireland (0.05)
- North America > United States > Washington > King County > Seattle (0.04)
- Education > Educational Setting (1.00)
- Education > Curriculum > Subject-Specific Education (0.66)
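As a rough illustration of the selection-plus-review idea in the ISSR abstract above, the sketch below filters candidate distractors through a review step that rejects any word which would also be a correct answer. It is not the authors' implementation: the function name `select_distractors`, the `fits_blank` callback (a stand-in for the LLM-based self-review), and the example stem are all hypothetical.

```python
from typing import Callable, Iterable, List


def select_distractors(
    stem: str,
    answer: str,
    candidates: Iterable[str],
    fits_blank: Callable[[str, str], bool],
    k: int = 3,
) -> List[str]:
    """Pick up to k distractors, skipping any that would also be correct.

    `fits_blank(stem, word)` is a hypothetical reviewer: it returns True
    when `word` would be an acceptable answer for the blank in `stem`.
    In the abstract this role is played by an LLM; here it is a callback.
    """
    chosen: List[str] = []
    for cand in candidates:
        # Never offer the answer itself, and avoid duplicate options.
        if cand == answer or cand in chosen:
            continue
        # Review step: a distractor that fits the blank invalidates the item.
        if fits_blank(stem, cand):
            continue
        chosen.append(cand)
        if len(chosen) == k:
            break
    return chosen


# Toy reviewer for demonstration only: treats two words as valid answers.
def toy_reviewer(stem: str, word: str) -> bool:
    return word in {"rapid", "quick"}


options = select_distractors(
    "The ___ current carried the boat downstream.",
    "rapid",
    ["quick", "blue", "rapid", "blue", "calm", "tall"],
    toy_reviewer,
)
print(options)
```

In this toy run, "quick" is rejected by the reviewer because it would also fit the blank, which is the failure mode the abstract says the self-review step is meant to catch.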
Establishing Vocabulary Tests as a Benchmark for Evaluating Large Language Models
Martínez, Gonzalo, Conde, Javier, Merino-Gómez, Elena, Bermúdez-Margaretto, Beatriz, Hernández, José Alberto, Reviriego, Pedro, Brysbaert, Marc
Vocabulary tests, once a cornerstone of language modeling evaluation, have been largely overlooked in the current landscape of Large Language Models (LLMs) like Llama, Mistral, and GPT. While most LLM evaluation benchmarks focus on specific tasks or domain-specific knowledge, they often neglect the fundamental linguistic aspects of language understanding and production. In this paper, we advocate for the revival of vocabulary tests as a valuable tool for assessing LLM performance. We evaluate seven LLMs using two vocabulary test formats across two languages and uncover surprising gaps in their lexical knowledge. These findings shed light on the intricacies of LLM word representations, their learning mechanisms, and performance variations across models and languages. Moreover, the ability to automatically generate and perform vocabulary tests offers new opportunities to expand the approach and provide a more complete picture of LLMs' language skills.
- North America > United States > Colorado (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- Africa > Kenya > Mandera County > Mandera (0.04)
Around the world in 60 words: A generative vocabulary test for online research
van Rijn, Pol, Sun, Yue, Lee, Harin, Marjieh, Raja, Sucholutsky, Ilia, Lanzarini, Francesca, André, Elisabeth, Jacoby, Nori
Conducting experiments with diverse participants in their native languages can uncover insights into culture, cognition, and language that may not be revealed otherwise. However, conducting these experiments online makes it difficult to validate self-reported language proficiency. Furthermore, existing proficiency tests are small and cover only a few languages. We present an automated pipeline to generate vocabulary tests using text from Wikipedia. Our pipeline samples rare nouns and creates pseudowords with the same low-level statistics. Six behavioral experiments (N=236) in six countries and eight languages show that (a) our test can distinguish between native speakers of closely related languages, (b) the test is reliable ($r=0.82$), and (c) performance strongly correlates with existing tests (LexTale) and self-reports. We further show that test accuracy is negatively correlated with the linguistic distance between the tested and the native language. Our test, available in eight languages, can easily be extended to other languages.
- Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
- North America > United States > New Jersey (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Health & Medicine (0.68)
- Education (0.67)
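The last abstract above mentions creating pseudowords that share low-level statistics with real words. One common way to approximate this, shown here as a minimal sketch and not as the authors' pipeline, is a character n-gram model trained on a word list. The function names, the `order` and `max_len` parameters, and the small word list are assumptions made for illustration; the abstract does not specify the method.

```python
import random
from collections import defaultdict


def train_char_model(words, order=2):
    """Map each `order`-character context to the characters that follow it."""
    model = defaultdict(list)
    for word in words:
        padded = "^" * order + word + "$"  # "^" pads the start, "$" marks the end
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model


def make_pseudoword(model, real_words, order=2, rng=None,
                    max_len=12, max_tries=1000):
    """Sample a string from the model that is not in `real_words`.

    Returns None if no novel string is found within `max_tries` attempts.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        context, chars, finished = "^" * order, [], False
        while len(chars) <= max_len:
            nxt = rng.choice(model[context])
            if nxt == "$":
                finished = True
                break
            chars.append(nxt)
            context = (context + nxt)[-order:]
        candidate = "".join(chars)
        # Keep only complete samples that are not real words.
        if finished and candidate not in real_words:
            return candidate
    return None


# Illustrative word list (an assumption; the paper samples nouns from Wikipedia).
nouns = ["garden", "gander", "harden", "wander",
         "warden", "burden", "banter", "garter"]
model = train_char_model(nouns)
print(make_pseudoword(model, set(nouns), rng=random.Random(0)))
```

Because every sampled transition was observed in the training words, each generated string contains only character sequences that occur in real words, while the membership check keeps real words out of the pseudoword set.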